ABSTRACT

In this paper, the origin of selection and ranking problems is discussed. Then the two basic approaches to the selection problem, the indifference zone approach and the subset selection approach, are reviewed briefly. As an application, Gupta's subset selection procedure is applied to motor-vehicle fatality data that fit a two-way layout.


I.  INTRODUCTION  AND  ORIGIN  OF  THE  PROBLEM 

A common problem faced by an experimenter is that of comparing several categories or populations. These may be, for example, different varieties of a grain, different competing manufacturing processes for an industrial product, different drugs (treatments) for a specific disease, or different alternatives under which a simulated system is run. In other words, we have $k$ ($\geq 2$) populations, and each population is characterized by the value of a parameter of interest $\theta$, which may be, in the example of drugs, an appropriate measure of the effectiveness of a drug. The classical approach to this problem is to test the hypothesis $H_0: \theta_1 = \cdots = \theta_k$, where $\theta_1, \ldots, \theta_k$ are the values of the parameter for these populations. In the case of normal populations with means $\theta_1, \ldots, \theta_k$ and a common variance $\sigma^2$, the test can be carried out using the F-ratio of the analysis of variance.

The above classical approach is inadequate and unrealistic in the sense that it often cannot answer the experimenter's real questions, such as how to identify the best category. Often in practice, after the hypothesis $H_0: \theta_1 = \cdots = \theta_k$ has been rejected, one of the multiple-comparison procedures designed for making inferences concerning all pairwise differences of the $\theta_i$, or all linear contrasts of the $\theta_i$, is employed, and based on its outcome some purported 'best' set of populations is chosen. But this method of choosing a 'best' set of populations is indirect and does not control any error rate relevant to the problem, for example, the probability of an incorrect selection.


II.  TWO  APPROACHES  TO  THE  SELECTION  PROBLEM 

The formulation of a k-sample problem as a selection and ranking problem enables the experimenter to answer his natural questions regarding the best category. There are two basic approaches to the problem of selection. The first is the indifference zone approach introduced by Bechhofer in [1]. The second is the subset selection approach introduced by Gupta in [6].

In order to explain the two approaches, consider the problem of selecting the population with the largest mean from $k$ normal populations with unknown means $\mu_i$, $i = 1, \ldots, k$, and a common known variance $\sigma^2$. Let $\bar{x}_i$, $i = 1, \ldots, k$, denote the sample means of independent samples of size $n$ from these populations. The 'natural' procedure is to select the population that yields the largest $\bar{x}_i$. The experimenter would, of course, want a guarantee that this procedure will pick the population with the largest $\mu_i$ with a probability not less than a specified level $P^*$. For the problem to be meaningful, $P^*$ should be between $1/k$ and 1. Since we do not know the true configuration of the $\mu_i$, we look for the least favorable configuration (LFC), for which the probability of a correct selection (PCS) is at a minimum. Without restrictions on the $\mu_i$, $i = 1, \ldots, k$, the LFC is given by $\mu_1 = \cdots = \mu_k$, for which the PCS is only $1/k$, so the probability guarantee cannot be met, whatever the sample size $n$.

A natural modification is to insist on the minimum probability guarantee whenever the best population is sufficiently superior to the next best. In other words, the experimenter specifies a positive constant $\delta^*$ and requires the PCS to be at least $P^*$ whenever $\mu_{[k]} - \mu_{[k-1]} \geq \delta^*$, where $\mu_{[1]} \leq \cdots \leq \mu_{[k]}$ denote the ordered means. So the minimization of the PCS is over the part of the parameter space in which $\mu_{[k]} - \mu_{[k-1]} \geq \delta^*$. The complement of this region is called the indifference zone for the obvious reason. The problem is to determine the minimum sample size $n$ required in order to achieve PCS $\geq P^*$ for the LFC. This approach is known as the indifference zone approach.
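The sample-size determination just described can be sketched numerically. The fragment below is a minimal illustration, assuming the known-variance normal setting of this section and taking the least favorable configuration to be the standard "slippage" one, in which the best mean exceeds the other $k-1$ equal means by exactly the specified difference (`delta_star` in the code). Under that assumption the PCS of the natural rule is $\int \Phi^{k-1}(t + \sqrt{n}\,\delta/\sigma)\, d\Phi(t)$. The function names, the integration range, and the example inputs are illustrative choices, not part of the original procedure.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def pcs_slippage(k, shift, lo=-8.0, hi=8.0, steps=4000):
    """PCS of the 'pick the largest sample mean' rule when the best mean
    exceeds k-1 equal means by shift = sqrt(n) * delta_star / sigma.
    Approximates the integral of Phi(t + shift)**(k-1) * phi(t) dt
    with the trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * Phi(t + shift) ** (k - 1) * phi(t)
    return total * h

def min_sample_size(k, delta_star, sigma, p_star):
    """Smallest n with PCS >= p_star in the assumed least favorable
    configuration. Requires 1/k < p_star < 1 so the search terminates."""
    n = 1
    while pcs_slippage(k, math.sqrt(n) * delta_star / sigma) < p_star:
        n += 1
    return n

# Hypothetical inputs: 2 populations, delta_star = sigma = 1, P* = 0.90.
print(min_sample_size(2, 1.0, 1.0, 0.90))
```

With `shift = 0` the function returns $1/k$, which is the unrestricted least favorable case discussed earlier; a linear search over `n` is used only for transparency.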
In the subset selection approach, the goal is to select a non-empty subset of the populations so as to include the best population. Here the size of the selected subset is not fixed in advance, but rather is determined by the observations themselves. For our example of normal populations with unknown means and common known variance $\sigma^2$, the rule proposed by Gupta in [6] selects the population that yields $\bar{x}_i$ if and only if

$$\bar{x}_i \geq \max_{1 \leq j \leq k} \bar{x}_j - d_1 \sigma / \sqrt{n},$$

where $d_1 = d_1(k, P^*) > 0$ is determined so that the PCS is at least $P^*$. The constant $d_1$ is determined by

$$\int_{-\infty}^{\infty} \Phi^{k-1}(t + d_1) \, d\Phi(t) = P^*,$$

where $\Phi$ is the cumulative distribution function of a standard normal variable. Tables of $d_1$ for selected values of $k$ and $P^*$ are available in Gupta, Nagel, and Panchapakesan [9]. In the case where $\sigma^2$ is common but unknown, Gupta's procedure is to select the population that yields $\bar{x}_i$ if and only if

$$\bar{x}_i \geq \max_{1 \leq j \leq k} \bar{x}_j - d_2 s / \sqrt{n},$$

where $s^2$ is the usual pooled estimate of $\sigma^2$ based on $\nu = k(n-1)$ degrees of freedom. The constant $d_2 = d_2(k, \nu, P^*)$ is chosen to satisfy the $P^*$ condition and is determined by

$$\int_{0}^{\infty} \int_{-\infty}^{\infty} \Phi^{k-1}(t + d_2 u) \, d\Phi(t) \, dQ_\nu(u) = P^*,$$

where $\Phi$ is as before and $Q_\nu$ is the distribution function of $\chi_\nu / \sqrt{\nu}$. For selected values of $P^*$, $k$ and $\nu$, the values of $d_2$ were tabulated by Gupta and Sobel in [11].
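When the published tables are not at hand, the constant for the known-variance rule can be approximated by solving the integral equation $\int \Phi^{k-1}(t + d)\, d\Phi(t) = P^*$ numerically. The following is a minimal sketch under that reading of the equation; the function names, the trapezoid rule, and the bisection bracket $[0, 20]$ are assumptions made for illustration and are not taken from the cited tables.

```python
import math

def phi(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def Phi(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def pcs_equal_means(k, d, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoid approximation to the integral of
    Phi(t + d)**(k-1) * phi(t) dt over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        t = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * Phi(t + d) ** (k - 1) * phi(t)
    return total * h

def gupta_d1(k, p_star, tol=1e-6):
    """Solve pcs_equal_means(k, d) = p_star for d by bisection.
    Assumes 1/k < p_star < 1 and that the root lies in [0, 20]."""
    lo, hi = 0.0, 20.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_equal_means(k, mid) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example: 49 populations and P* = 0.90, the case used later in the paper.
print(round(gupta_d1(49, 0.90), 3))
```

For $k = 2$ the equation reduces to $\Phi(d/\sqrt{2}) = P^*$, which gives a quick check of the numerical routine.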

It should be pointed out that the two approaches, namely, indifference zone and subset selection, differ in that the former requires specification of two constants, $P^*$ and $\delta^*$, to select a fixed number $t$, say, of populations; the latter (subset selection) requires only one constant, namely $P^*$, to be specified and selects a subset of random size depending on the outcome of the experiment.

Performance of subset selection procedures can be discussed in terms of the true probability of a correct selection, expected subset size, expected proportion selected, and other similar quantities. A number of performance studies have been carried out; see, for example, Gupta [7] and Deely and Gupta [4]. Gupta and Panchapakesan [10] gave a comprehensive account of the relevant work in the area up to that time. Since then progress has been made in several directions. Dudewicz and Dalal [5] considered, among other things, two-stage procedures for the normal means problem with unknown and unequal variances. Gupta and Huang [8] also considered the normal means problem with unequal variances. In his Ph.D. thesis [2], Berger considered the minimaxity and admissibility of subset selection procedures. Two recent Monte Carlo studies by Chernoff and Yahav [3] and Hsu [12] showed that, for the normal means problem discussed, Gupta's normal means procedures are nearly optimal in the sense that, with respect to normal priors, their integrated risks are close to those of Bayes procedures.

In the next section we shall illustrate the use of subset selection procedures by applying the method just described to traffic fatality data.


III.  AN  ANALYSIS  OF  MOTOR-VEHICLE  FATALITY  DATA 

In McDonald [13], the use of nonparametric subset selection procedures is illustrated by the application of these procedures to a set of traffic fatality data. For comparison purposes, we shall use the same data set. We are indebted to Dr. McDonald for allowing us to use these data and for suggesting the useful transformation used subsequently.

The traffic fatality data used in McDonald [13] are motor-vehicle traffic fatality rates (MFR) for the forty-eight contiguous states and the District of Columbia for the years 1960 to 1976. See Table 1 of McDonald [13]. It would be of interest to select those states that have MFR much higher or lower than average. Further investigation of these states might identify factors related to MFR. We shall illustrate the use of subset selection procedures by selecting a set of 'best' populations and a set of 'worst' populations.

Let $X_{ij}$ denote the MFR for the $i$th state and the $j$th year, $i = 1, \ldots, 49$, $j = 1, \ldots, 17$. The index $i$ denotes the state in alphabetic order and the index $j$ denotes the year in increasing order. Our goal is roughly to select the states having the lowest (highest) "average" MFR. For an appropriate model we consider the two-way layout:

$$X_{ij} = m + a_i + b_j + (ab)_{ij} + e_{ij}, \quad i = 1, \ldots, 49, \; j = 1, \ldots, 17, \qquad (1)$$

$$\sum_{i=1}^{49} a_i = 0, \quad \sum_{j=1}^{17} b_j = 0, \quad \sum_{i=1}^{49} (ab)_{ij} = 0, \quad \sum_{j=1}^{17} (ab)_{ij} = 0,$$

where the $e_{ij}$ are independently distributed with means 0. Our goal, stated in terms of the model (1), is as follows:

Goal 1: Select the states having the smallest (largest) $m + a_i$.

Note that our model is the fixed-effect model. The factor 'year' is not considered to be a random factor since it can be observed from the data that, from around 1968, there has been a general decreasing trend in the fatality rate.


Using traditional analysis of variance techniques, one would first test the hypothesis $H: (ab)_{ij} = 0$ for all $i, j$. If this hypothesis is accepted, one would proceed to test whether each of the main effects is significant. However, our goal is the stated Goal 1. We are not particularly interested in whether the main effects are significant.

In order to achieve our goal, intuitively we need to have good estimates of the $a_i$'s. We also need to have estimates of the variances of these estimates. For the latter it is generally necessary to have $(ab)_{ij} = 0$ for all $i, j$. Therefore, Tukey's test for additivity was run to test the hypothesis $H_0: (ab)_{ij} = 0$ for all $i, j$. Unfortunately the test rejected the null hypothesis. However, it is often possible to transform the data so that the interaction term for the transformed data is statistically insignificant. For these MFR data, the monotone transformation $Y_{ij} = \ln(X_{ij} - 1)$ appears to be such a transformation. Tukey's test on the transformed data showed no significance against the hypothesis of no interaction. (For the analysis that led to the choice of this transformation, see McDonald [13].) Thus, for the transformed data the following model (2) appears to be reasonable:

$$Y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}, \quad i = 1, \ldots, 49, \; j = 1, \ldots, 17, \qquad (2)$$

$$\sum_{i=1}^{49} \alpha_i = 0, \quad \sum_{j=1}^{17} \beta_j = 0,$$

where the $\varepsilon_{ij}$ are independently distributed with means 0.
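Tukey's one-degree-of-freedom test for additivity, used above, can be sketched as follows for a two-way table with one observation per cell. This is a minimal illustration of the textbook form of the test, not the computation actually run for the paper; the function name and the small example table are hypothetical. The non-additivity sum of squares is $\left[\sum_{i,j} y_{ij}\,\hat{a}_i \hat{b}_j\right]^2 / \left(\sum_i \hat{a}_i^2 \sum_j \hat{b}_j^2\right)$, where $\hat{a}_i$ and $\hat{b}_j$ are the estimated row and column effects.

```python
def tukey_additivity(y):
    """One-degree-of-freedom test for non-additivity in a two-way table.

    y is a list of rows (at least 2 rows and 2 columns), one observation
    per cell. Returns (ss_nonadd, ss_remainder, df_remainder).
    Assumes the row effects and the column effects are not all zero."""
    a, b = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (a * b)
    row_eff = [sum(row) / b - grand for row in y]
    col_eff = [sum(y[i][j] for i in range(a)) / a - grand for j in range(b)]

    # Residual sum of squares after fitting the additive model
    ss_resid = sum((y[i][j] - grand - row_eff[i] - col_eff[j]) ** 2
                   for i in range(a) for j in range(b))

    # Single-degree-of-freedom component for non-additivity
    cross = sum(y[i][j] * row_eff[i] * col_eff[j]
                for i in range(a) for j in range(b))
    denom = sum(r * r for r in row_eff) * sum(c * c for c in col_eff)
    ss_nonadd = cross ** 2 / denom

    df_remainder = (a - 1) * (b - 1) - 1
    return ss_nonadd, ss_resid - ss_nonadd, df_remainder

# Hypothetical 3 x 3 table with a purely multiplicative interaction
effects = [-1.0, 0.0, 1.0]
table = [[5.0 + r + c + 0.5 * r * c for c in effects] for r in effects]
ss_nonadd, ss_rem, df_rem = tukey_additivity(table)
print(ss_nonadd, ss_rem, df_rem)
```

The test statistic is `ss_nonadd / (ss_rem / df_rem)`, referred to an F distribution with 1 and `df_rem` degrees of freedom; for a 49 by 17 table `df_rem` is $48 \times 16 - 1 = 767$. In the artificial example all of the residual variation is multiplicative, so the remainder is essentially zero.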

To investigate further, tests were run on the sample residuals $y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot}$, where $\bar{y}_{i\cdot} = \sum_{j=1}^{17} y_{ij}/17$, $\bar{y}_{\cdot j} = \sum_{i=1}^{49} y_{ij}/49$, and $\bar{y}_{\cdot\cdot} = \sum_{i=1}^{49} \sum_{j=1}^{17} y_{ij}/(49 \times 17)$. A test of homogeneity of variance of the residuals showed no significance. Against the hypothesis that the residuals are normally distributed, the two-tailed Kolmogorov-Smirnov test rejects at the 5% level but not at the 1% level. See Table 1. Plotting the residuals on normal probability paper further reveals that the distribution of the residuals during the first seven years has a slightly longer left-hand tail than the normal distribution, while during the last ten years it is essentially normal. Thus it appears not unreasonable to assume that the $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$ for some $\sigma^2$.

TABLE 1

Kolmogorov-Smirnov Goodness of Fit Test
Test distribution: Normal (mean = .0000, std. dev. = .5963)

Cases    Max. diff.    2-tailed P
833      -.0181        .9480

Our original goal is to select those states $i$ that have the lowest (highest) $E(\bar{X}_{i\cdot})$, where $\bar{X}_{i\cdot} = \sum_{j=1}^{17} X_{ij}/17$. Under the model (2) and the assumption that the $\varepsilon_{ij}$ are independent $N(0, \sigma^2)$, it can be shown that the relative ordering among the $E(\bar{Y}_{i\cdot})$, where $\bar{Y}_{i\cdot} = \sum_{j=1}^{17} Y_{ij}/17 = \sum_{j=1}^{17} \ln(X_{ij} - 1)/17$, is the same as the relative ordering among the $E(\bar{X}_{i\cdot})$. This amounts to saying that the 'transformed' mean fatality rates have the same relative ordering as the original untransformed mean fatality rates. Hence, we can restate our original goal (Goal 1) in terms of the quantities in the model (2) as

Goal 2: Select the states having the smallest (largest) $\mu + \alpha_i$.
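The ordering claim can be checked directly under the stated assumptions. The short derivation below uses the standard formula for the mean of a lognormal variable; it is offered as one way to see the result, assuming model (2) with independent normal errors and the logarithmic transformation of the data minus one, and is not necessarily the authors' own argument.

```latex
% Assumes model (2), independent N(0, sigma^2) errors, and Y_ij = ln(X_ij - 1).
\begin{align*}
X_{ij} &= 1 + e^{Y_{ij}}, \qquad
  Y_{ij} \sim N(\mu + \alpha_i + \beta_j,\ \sigma^2), \\
E(X_{ij}) &= 1 + \exp\!\left(\mu + \alpha_i + \beta_j + \tfrac{1}{2}\sigma^2\right), \\
E(\bar{X}_{i\cdot}) &= 1 + e^{\mu + \alpha_i + \sigma^2/2}
  \cdot \frac{1}{17} \sum_{j=1}^{17} e^{\beta_j}, \\
E(\bar{Y}_{i\cdot}) &= \mu + \alpha_i
  \qquad \text{since } \sum_{j=1}^{17} \beta_j = 0 .
\end{align*}
```

The factor multiplying $e^{\alpha_i}$ in the third line is positive and does not depend on $i$, so both expectations are increasing functions of $\alpha_i$ and therefore order the states identically.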

Before we apply Gupta's normal means procedure to the transformed data, some comments are in order. We have stated earlier that from the Monte Carlo studies of Chernoff and Yahav [3] and Hsu [12] we know that in the normal case Gupta's procedure performs well. Hence for Goal 2, Gupta's procedure will have good performance. But the transformation changed the scale of measurement for the means and substantially changed the variances of the relevant quantities. One might question whether a procedure good for Goal 2 is necessarily good for Goal 1. Those Monte Carlo studies showed that Gupta's procedure is good for a variety of loss functions corresponding to the general goal of selecting a subset of good (bad) populations. Hence in terms of Goal 1, applying Gupta's procedure to the transformed data should give good results.

To apply Gupta's normal means procedure we shall estimate each $\mu + \alpha_i$ by $\bar{y}_{i\cdot}$ and $\sigma^2$ by $s^2 = \sum_{i=1}^{49} \sum_{j=1}^{17} (y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot})^2 / (48 \times 16)$. Table 2 lists the $\bar{y}_{i\cdot}$ and the corresponding states in ascending order of $\bar{y}_{i\cdot}$. The calculated value of $s^2$ is also given. From Gupta, Nagel and Panchapakesan [9], the $d_1$ value corresponding to $P^* = .90$ is 3.651. Therefore, to select a subset of states such that with probability .90 the state with the best (lowest) true MFR is included, we select those states with $\bar{y}_{i\cdot} \leq \bar{y}_{38\cdot} + d_1 s/\sqrt{17} = .360 + 0.109 = 0.469$. Only Rhode Island is selected. To select a subset of states such that with probability .90 the state with the true worst (highest) MFR is included, we select those states with $\bar{y}_{i\cdot} \geq \bar{y}_{30\cdot} - d_1 s/\sqrt{17} = 1.777 - 0.109 = 1.668$. Six states are selected. See Table 2. For $P^* = .99$, for the set of 'best' populations, again only Rhode Island is selected. For the set of 'worst' populations, ten states are selected.
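The selection step at $P^* = .90$ can be reproduced from the quantities quoted in the text. The sketch below uses only an excerpt of Table 2 (a few of the lowest and highest transformed means), which is enough because the cut-offs depend only on the smallest and largest means; the constant 3.651, $s = 0.123$, and the 17 years are taken from the paper, while the variable names are illustrative.

```python
import math

# Excerpt of Table 2: transformed means for some of the lowest and
# highest states (the full table has 49 entries).
ybar = {
    "Rhode Island": 0.360, "Connecticut": 0.538, "New Jersey": 0.667,
    "Arizona": 1.636, "South Carolina": 1.648, "Alabama": 1.651,
    "Montana": 1.691, "Idaho": 1.698, "Louisiana": 1.758,
    "Mississippi": 1.773, "Nevada": 1.775, "New Mexico": 1.777,
}

d = 3.651      # tabulated constant for k = 49, P* = .90 (quoted in the text)
s = 0.123      # estimated standard deviation from Table 2
n_years = 17   # observations per state

margin = d * s / math.sqrt(n_years)

# 'Best' subset: means within the margin of the smallest mean
best = sorted(name for name, y in ybar.items()
              if y <= min(ybar.values()) + margin)

# 'Worst' subset: means within the margin of the largest mean
worst = sorted(name for name, y in ybar.items()
               if y >= max(ybar.values()) - margin)

print(round(margin, 3))
print(best)
print(worst)
```

Alabama, at 1.651, falls just below the cut-off for the 'worst' subset, which is why that subset stops at six states.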

Let us compare Gupta's normal means procedure with McDonald's rank sum procedure (described in detail in McDonald [13]). For $P^* = .90$, the normal means procedure selects six states as 'worst' populations while the rank sum procedure selects ten. This is not surprising since more assumptions are made in applying the normal means procedure, hence one is able to obtain stronger results. The rank sum procedure selects twelve states as 'best' populations while the normal means procedure selects only one. This may seem mildly surprising, but a careful examination of the basic MFR data readily reveals the reason. From Table 1 of McDonald [13] one sees that the MFR for Rhode Island is consistently much smaller than average. This causes the normal means procedure to select that state alone. The rank sum procedure is based on relative ranks only. It is designed so that the information concerning the magnitude of the differences in the sample is ignored. Hence there is no drastic reduction in the number of states selected for the rank sum procedure.


IV.  CONCLUSION 


In the last twenty-five years, research in the area of selection and ranking procedures has progressed steadily. These procedures clearly have great potential for application in simulation studies and in other areas. They have not been used more, perhaps because their use calls for applied statisticians to give up the ingrained habit of hypothesis testing. In view of the fact that some optimality properties of these procedures are becoming known, the time is right for making an effort to apply these procedures in practice.


TABLE 2

Selection of States in Terms of MFR

                                 P* = .90         P* = .99
 i  State             Mean    Gupta  McDonald      Gupta

38  Rhode Island      .360      *       *            *
 6  Connecticut       .538
29  New Jersey        .667
 8  Dist. of Col.     .775
20  Massachusetts     .899
19  Maryland         1.037
37  Pennsylvania     1.045
28  New Hampshire    1.065
18  Maine            1.114
 7  Delaware         1.130
34  Ohio             1.163
46  Washington       1.167
 4  California       1.195
12  Illinois         1.198
45  Virginia         1.210
21  Michigan         1.218
22  Minnesota        1.231
31  New York         1.245
26  Nebraska         1.251
48  Wisconsin        1.310
15  Kansas           1.323
13  Indiana          1.333
35  Oklahoma         1.339
14  Iowa             1.374
 5  Colorado         1.378
 9  Florida          1.380
33  North Dakota     1.389
42  Texas            1.390
43  Utah             1.395
24  Missouri         1.403
36  Oregon           1.415
44  Vermont          1.458
16  Kentucky         1.467
41  Tennessee        1.495              *
10  Georgia          1.533              *
 3  Arkansas         1.546              *
40  South Dakota     1.561              *
47  West Virginia    1.580              *
32  North Carolina   1.625              *
49  Wyoming          1.636              *            *
 2  Arizona          1.636              *            *
39  South Carolina   1.648              *            *
 1  Alabama          1.651              *            *
25  Montana          1.691      *       *            *
11  Idaho            1.698      *       *            *
17  Louisiana        1.758      *       *            *
23  Mississippi      1.773      *       *            *
27  Nevada           1.775      *       *            *
30  New Mexico       1.777      *       *            *

$s^2 = 0.0152$, $s = 0.123$

* denotes a selected state. Mean denotes $\bar{y}_{i\cdot}$, the mean of the transformed MFR for state $i$. Apart from Rhode Island, the states selected as 'best' by McDonald's procedure are not marked in this listing.